"The One counts Himself, and no-one else counts Him, and He is every number. He is
root, and foundation and square and cube, and He is like the essence that carries all the cases, and
every number is in His power, and He is in every number in deed, and He is present, and every
number is present because of Him, and He is Ancient, and every [other] number is [re]newed, and
He is the reason for every number, pair [even] and that is not pair, He is not a number, and will
not multiply and will not divide."

- Abraham Ibn Ezra (1089-1164), Sefer HaEkhad ("Book of One") [I]

Abstract: We use recurrence equations (alias difference equations) to enumerate the number of
formula-representations of positive integers using only addition and multiplication, and using
addition, multiplication, and exponentiation, where all the inputs are ones. We also describe
efficient algorithms for the random generation of such representations, and use Dynamical
Programming to find a shortest possible formula representing any given positive integer.

Very Important: This article is accompanied by the Maple package

http://www.math.rutgers.edu/~zeilberg/tokhniot/ArithFormulas ,

and the output files that are linked to from the webpage ("front") of this article

http://www.math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/arif.html

Prologue

According to conventional wisdom, the invention ("discovery") of zero was one of the greatest
moments in the annals of mathematics. We respectfully disagree. The invention of zero was a great
disaster, that led to the beginning of nihilism. Here we will show how it is possible to manage
very well without 0.

Introduction 

Mark Twain once wrote a letter to a friend that started with
"I didn't have time to write a short letter so I wrote a long one ..."

We mathematicians (and computer scientists) deal with numbers rather than words, but even the 
seemingly naive question of representing a positive integer as succinctly as possible is far from 
trivial. 

This interesting question was addressed in [GD], where the systematic study of arithmetical
formula-representation was initiated, and two natural ways, called there "the first canonical form" (FCF),
and the "second canonical form" (SCF) were introduced. The present article is a natural follow-up
of [GD], but in order to make it self-contained, we will review the basic notions.

We are all familiar with the "caveman's representation" of a positive integer by marking lines (only
using 1's), for example

17 = 11111111111111111 , 

also called the unary representation. More "efficiently" we have the familiar decimal, 'positional'
system that, alas, needs ten symbols. The binary representation "only" uses two (one too many!)
symbols, 0 and 1, where, for example, seventeen is written as 10001, meaning

17 = 1 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 .

One can use the "sparse notation" by only keeping the 1's

17 = 2^4 + 1 ,

and doing the same for the exponents

17 = 2^{2^2} + 1 ,

and finally replacing 2 by 1 + 1, getting an expression that only uses 1

17 = (1+1)^{(1+1)^{(1+1)}} + 1 .

This led (in [GD]) to the First Canonical Form. Another natural way is to use the Fundamental
Theorem of Arithmetic and factor the integer into prime powers, and then either write each prime
as a sum of 1's and keep factorizing the exponents, or write a prime p as 1 + (p - 1) and factorize
p - 1 and continue recursively. This led, in [GD], to the Second Canonical Form.
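
The recursive binary expansion described above (sparse binary notation, applied again to the exponents, with 2 replaced by 1 + 1) can be sketched in a few lines of Python. This is only an illustration in the spirit of the construction, not the precise definition of the First Canonical Form in [GD]; the function names `fcf` and `value` are our own, and sums of more than two terms are left unparenthesized for readability.

```python
def fcf(n: int) -> str:
    """Write n using only 1, +, ^ via its sparse binary expansion (a sketch)."""
    if n < 1:
        raise ValueError("only positive integers can be represented")
    terms = []
    for k in range(n.bit_length() - 1, -1, -1):
        if (n >> k) & 1:
            if k == 0:
                terms.append("1")          # the term 2^0
            elif k == 1:
                terms.append("(1+1)")      # the term 2^1; avoids 1 as an exponent
            else:
                # the term 2^k, with the exponent k expanded recursively
                terms.append("(1+1)^(" + fcf(k) + ")")
    return "+".join(terms)


def value(expr: str) -> int:
    """Evaluate a formula over 1, +, *, ^ by translating ^ to Python's **."""
    return eval(expr.replace("^", "**"))


print(fcf(17))         # (1+1)^((1+1)^((1+1)))+1
print(value(fcf(17)))  # 17
```

The output for 17 reproduces the expression derived above, up to one redundant pair of parentheses around each exponent.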

Either way, the bottom line is an expression that only uses 1's, plus ("+"), times ("*"), and
exponentiation ("^").

We will make the convention that 1 can never be an argument of either multiplication or exponen- 
tiation, or else there would be infinitely many ways of representing even 1. 

Given a positive integer n, how can we express it as a formula only using the operations {+, *, ^}
and the integer 1? [where we consider our operations as binary, i.e. fan-in 2]

Of course there is only one way to express 1, namely, 1. There is also only one way to express 2: 

2=1+1 . 

[Strictly speaking we should write 2 = (1) + (1), but we will abuse notation and abbreviate (1) to
1].

There are exactly two ways to express 3

3 = (1 + 1) + 1 , 3 = 1 + (1 + 1) . 
So far we only used addition. There are five ways to express 4 only using addition: 
1 + ((1 + 1) + 1) , 1 + (1 + (1 + 1)) , (1 + 1) + (1 + 1) , ((1 + 1) + 1) + 1 , (1 + (1 + 1)) + 1 . 

[In general there are C_{n-1} = (2n-2)!/((n-1)! n!) ways of expressing n only using addition].
If you are also allowing multiplication, then we have, in addition (no pun intended)

4 = (1 + 1) * (1 + 1) ,
and if you are also allowing exponentiation, we have

4 = (1 + 1) ^ (1 + 1) .

The above are examples of formulas whose inputs are always 1's. The easiest way to define a
formula is via 'grammars'. If we only use addition, the additive formulas are given by the grammar

F = 1 OR (F) + (F) ,

while the formulas that allow both addition and multiplication are defined by

F = 1 OR (F) + (F) OR (F) * (F) ,

and if you also allow exponentiation, then the grammar is

F = 1 OR (F) + (F) OR (F) * (F) OR (F) ^ (F) .
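
For small n the grammars can be unfolded directly into a brute-force enumerator. The following Python sketch (the name `formulas` and the string encoding are our own choices, not part of ArithFormulas) lists every formula with a prescribed value, respecting the convention that 1 is never an argument of multiplication or exponentiation, and abbreviating (1) to 1.

```python
from functools import lru_cache


def wrap(f: str) -> str:
    """Parenthesize a sub-formula, abbreviating (1) to 1."""
    return f if f == "1" else "(" + f + ")"


@lru_cache(maxsize=None)
def formulas(n: int, ops: str = "+*^") -> tuple:
    """All formulas over 1 and the operations in ops whose value is n."""
    if n == 1:
        return ("1",)
    out = []
    if "+" in ops:
        for k in range(1, n):                      # n = k + (n - k)
            out += [wrap(a) + "+" + wrap(b)
                    for a in formulas(k, ops) for b in formulas(n - k, ops)]
    if "*" in ops:
        for i in range(2, n // 2 + 1):             # n = i * (n / i), both > 1
            if n % i == 0:
                out += [wrap(a) + "*" + wrap(b)
                        for a in formulas(i, ops) for b in formulas(n // i, ops)]
    if "^" in ops:
        for i in range(2, n):                      # n = i ^ j with i, j > 1
            j, p = 1, i
            while p < n:
                p *= i
                j += 1
            if p == n:
                out += [wrap(a) + "^" + wrap(b)
                        for a in formulas(i, ops) for b in formulas(j, ops)]
    return tuple(out)


print(len(formulas(4, "+")), len(formulas(4, "+*")), len(formulas(4, "+*^")))  # 5 6 7
```

The three counts agree with the five additive formulas for 4 listed above, plus one more for each additional operation.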

The above format is infix. As is well known (especially to users of HP calculators) one can get rid 
of parentheses, using postfix (alias Reverse Polish) notation. The translation from infix to postfix 
is easy 

1 -> 1 , a + b -> ab+ , a * b -> ab* , a ^ b -> ab^ .
Of course these transformation rules are to be applied recursively. For example, the expression

(1 + (1 + 1)) ^ ((1 + 1) + 1)

(representing twenty-seven), is written in postfix notation as

111++11+1+^ .
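
The recursive translation, and the stack-based evaluation that an RPN calculator performs, can be sketched as follows. The encoding of a formula as a nested tuple `(op, left, right)` with the integer 1 at the leaves is an assumption made for this illustration only.

```python
def to_postfix(t) -> str:
    """Apply the rules 1 -> 1 and (a op b) -> a b op recursively."""
    if t == 1:
        return "1"
    op, a, b = t
    return to_postfix(a) + to_postfix(b) + op


def eval_postfix(s: str) -> int:
    """Evaluate a postfix string over 1, +, *, ^ with an explicit stack."""
    stack = []
    for ch in s:
        if ch == "1":
            stack.append(1)
            continue
        b = stack.pop()          # right operand is on top of the stack
        a = stack.pop()
        if ch == "+":
            stack.append(a + b)
        elif ch == "*":
            stack.append(a * b)
        elif ch == "^":
            stack.append(a ** b)
        else:
            raise ValueError("unexpected symbol: " + ch)
    (result,) = stack            # a well-formed formula leaves exactly one value
    return result


two = ("+", 1, 1)
t27 = ("^", ("+", 1, two), ("+", two, 1))   # (1 + (1 + 1)) ^ ((1 + 1) + 1)
print(to_postfix(t27))                       # 111++11+1+^
print(eval_postfix(to_postfix(t27)))         # 27
```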
We have already mentioned that the number of expressions of n that only use addition, let's call
it C_a(n), is the famous Catalan sequence, shifted by one: C_a(n) = (2n-2)!/((n-1)! n!) (why?). Let
C_{am}(n) be the number of such expressions that use both addition and multiplication, and C_{ame}(n)
the number of expressions that use the full arsenal of addition, multiplication, and exponentiation.

In this short article (accompanied by a very long Maple package, and even longer sample output 
files) we will answer the following questions. 

• How to compute the sequences C_{am}(n) and C_{ame}(n) for as many n as possible? (It is unlikely
that there are closed-form formulas.)

• What are the asymptotics of C_{am}(n) and C_{ame}(n) as n \to \infty?

• How to draw, uniformly at random, such an expression?

• How to find the shortest possible expression for a given integer n? Of course, if you only use
addition, all C_a(n) expressions have the same length 2n - 1, but if one allows multiplication
one can get much shorter expressions, and if one also allows exponentiation, then one can get yet
shorter ones. [The length of such a minimal expression may be called the computational complexity
of the integer (w.r.t. the computational models discussed here)]

Enumeration 

Only using addition 

Let C_a(n) be the number of expressions for the positive integer n only using addition. Such an
expression may be written as n = k + (n - k) for some 1 \le k < n, and for each such k the number of
these is C_a(k) C_a(n - k), so we have the non-linear recurrence

C_a(n) = \sum_{k=1}^{n-1} C_a(k) C_a(n-k) , C_a(1) = 1 ,

whose solution is famously (2n-2)!/((n-1)! n!), the ubiquitous Catalan sequence [S] http://oeis.org/A000108
(shifted by one, so that C_a(1) = 1).
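
As a quick sanity check, the recurrence can be iterated directly and compared with the closed form. The sketch below is in Python rather than Maple, and the function names are ours; note that with the initial condition C_a(1) = 1 the value C_a(n) is the Catalan number of index n - 1.

```python
from math import comb  # available in Python 3.8 and later


def ca_seq(N: int) -> list:
    """Return [0, C_a(1), ..., C_a(N)] computed from the recurrence."""
    ca = [0, 1]
    for n in range(2, N + 1):
        ca.append(sum(ca[k] * ca[n - k] for k in range(1, n)))
    return ca


def ca_closed(n: int) -> int:
    """Closed form (2n-2)! / ((n-1)! n!), i.e. binomial(2n-2, n-1) / n."""
    return comb(2 * n - 2, n - 1) // n


print(ca_seq(6)[1:])  # [1, 1, 2, 5, 14, 42]
```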
Using addition and multiplication 

Let C_{am}(n) be the number of formula-trees with the leaves all 1's that represent the integer n, and
C^{+}_{am}(n) be the number of those whose root is +, and C^{*}_{am}(n) be the number of those whose root
is *. Then we have, of course

C_{am}(n) = C^{+}_{am}(n) + C^{*}_{am}(n) ,

and the non-linear recurrences (with C_{am}(1) = 1)

C^{+}_{am}(n) = \sum_{i=1}^{n-1} C_{am}(i) C_{am}(n-i) ,

C^{*}_{am}(n) = \sum_{1 < i \le \lfloor n/2 \rfloor, \; n/i \text{ integer}} C_{am}(i) C_{am}(n/i) .

[See procedures Cam(n) and CamSeq(N) in ArithFormulas].
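
A minimal Python sketch of these recurrences follows. It is an independent illustration, not a transcription of the Maple procedures, and the name `cam_seq` is our own. The three tables are filled in increasing order of n, which plays the role of Maple's remembered values.

```python
def cam_seq(N: int):
    """Return (cam, plus, times), lists indexed by n = 0..N (N >= 1 assumed)."""
    cam = [0] * (N + 1)
    plus = [0] * (N + 1)
    times = [0] * (N + 1)
    cam[1] = 1
    for n in range(2, N + 1):
        # root is +: n = i + (n - i), ordered pairs
        plus[n] = sum(cam[i] * cam[n - i] for i in range(1, n))
        # root is *: n = i * (n / i) with both factors larger than 1
        times[n] = sum(cam[i] * cam[n // i]
                       for i in range(2, n // 2 + 1) if n % i == 0)
        cam[n] = plus[n] + times[n]
    return cam, plus, times


cam, plus, times = cam_seq(6)
print(cam[1:])  # [1, 1, 2, 6, 16, 52]
```

For n = 4 this gives 5 + 1 = 6, matching the five additive formulas and the single product (1 + 1) * (1 + 1) found earlier.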

Using the Wilf-methodology [W][NW] we can use the remembered values of C_{am}(n), C^{+}_{am}(n), and
C^{*}_{am}(n) to generate uniformly at random such an expression. First use a loaded coin with
probabilities C^{+}_{am}(n)/C_{am}(n), C^{*}_{am}(n)/C_{am}(n) to decide whether the root-operation is "plus" or "times",
and in the former case use an (n - 1)-faced loaded die whose faces are labeled 1, ..., n - 1, and the
probability of landing on i is C_{am}(i) C_{am}(n - i)/C^{+}_{am}(n), and continue recursively for i, n - i,
assuming that it landed on i. Similarly, if the loaded coin decided that the root-operation is "*" then
create a loaded die whose faces are labeled by the non-trivial divisors of n, and the probability of
landing on face i is C_{am}(i) C_{am}(n/i)/C^{*}_{am}(n), and continue recursively.

[See procedures RaFamT(n) and RaFamP(n) in ArithFormulas].
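
The sampling scheme can be sketched in Python as follows (again an illustration with names of our own choosing, not the code of RaFamT or RaFamP). For brevity the loaded coin and the loaded die are merged into a single draw: the split (op, i) is chosen with probability C_{am}(i) C_{am}(j)/C_{am}(n), which is the product of the two probabilities described above. Exact integer arithmetic is used, so no rounding disturbs uniformity.

```python
import random


def cam_table(N: int) -> list:
    """cam[n] = number of {+, *}-formulas for n, for n = 1..N (cam[0] unused)."""
    cam = [0, 1] + [0] * (N - 1)
    for n in range(2, N + 1):
        cam[n] = (sum(cam[i] * cam[n - i] for i in range(1, n))
                  + sum(cam[i] * cam[n // i]
                        for i in range(2, n // 2 + 1) if n % i == 0))
    return cam


def random_formula(n: int, cam: list, rng: random.Random) -> str:
    """Draw a fully parenthesized {+, *}-formula for n uniformly at random."""
    if n == 1:
        return "1"
    splits = [("+", i, n - i) for i in range(1, n)]
    splits += [("*", i, n // i) for i in range(2, n // 2 + 1) if n % i == 0]
    r = rng.randrange(cam[n])           # uniform integer in [0, cam[n])
    for op, a, b in splits:
        w = cam[a] * cam[b]             # number of formulas with this root split
        if r < w:
            break
        r -= w
    return "(" + random_formula(a, cam, rng) + op + random_formula(b, cam, rng) + ")"


cam = cam_table(30)
rng = random.Random(2013)               # seeded only to make the run repeatable
print(random_formula(30, cam, rng))
```

Since the weights of all splits add up to C_{am}(n), every formula for n is produced with probability exactly 1/C_{am}(n).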

Using addition, multiplication and exponentiation 

Let C_{ame}(n) be the number of formula-trees, whose internal nodes are in {+, *, ^} and whose
leaves are all 1's, that represent the integer n, and C^{+}_{ame}(n) be the number of those whose root is
+, C^{*}_{ame}(n) be the number of those whose root is *, and C^{\wedge}_{ame}(n) be the number of those whose
root is ^.

Then we have, of course

C_{ame}(n) = C^{+}_{ame}(n) + C^{*}_{ame}(n) + C^{\wedge}_{ame}(n) ,

and the non-linear recurrences

C^{+}_{ame}(n) = \sum_{i=1}^{n-1} C_{ame}(i) C_{ame}(n-i) ,

C^{*}_{ame}(n) = \sum_{1 < i \le \lfloor n/2 \rfloor, \; n/i \text{ integer}} C_{ame}(i) C_{ame}(n/i) ,

C^{\wedge}_{ame}(n) = \sum_{i^j = n, \; j > 1} C_{ame}(i) C_{ame}(j) .

[See procedures Came(n) and CameSeq(N) in ArithFormulas].
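
The three recurrences translate into the following Python sketch (our own illustration; the name `came_seq` is hypothetical). The only new ingredient compared with the addition-multiplication case is the search for all ways of writing n as a perfect power i^j with j > 1.

```python
def came_seq(N: int) -> list:
    """came[n] = number of {+, *, ^}-formulas for n, n = 1..N (came[0] unused)."""
    came = [0, 1] + [0] * (N - 1)
    for n in range(2, N + 1):
        plus = sum(came[i] * came[n - i] for i in range(1, n))
        times = sum(came[i] * came[n // i]
                    for i in range(2, n // 2 + 1) if n % i == 0)
        power = 0
        i = 2
        while i * i <= n:               # the base i of a power i^j, j >= 2
            j, p = 1, i
            while p < n:
                p *= i
                j += 1
            if p == n:                  # n = i^j with j > 1
                power += came[i] * came[j]
            i += 1
        came[n] = plus + times + power
    return came


print(came_seq(8)[1:])  # [1, 1, 2, 7, 18, 58, 180, 613]
```

For n = 4 the three contributions are 5, 1 and 1, in agreement with the seven formulas for 4 exhibited in the Introduction.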

Using the Wilf-methodology [W][NW] we can use the remembered values of C_{ame}(n) and C^{+}_{ame}(n),
C^{*}_{ame}(n), C^{\wedge}_{ame}(n) to generate uniformly at random such an expression, in an analogous way
to the addition-multiplication trees above.

[See procedures RaFameT(n) and RaFameP(n) in ArithFormulas].

Finding the Shortest Formula



Using Dynamical Programming we can find the shortest possible formula (measured in terms of
length in postfix notation), in either category. For each n we look at all the possible root operations
and their subtrees and pick the shortest possibility, using the previously obtained shortest
expressions for the children.

[See procedures ShortestTam(n), ShortestTame(n) for the shortest formulas in infix (tree)
notation and procedures ShortestPam(n), ShortestPame(n) for the shortest formulas in postfix
(Reverse Polish) notation].
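
A minimal Python sketch of this dynamic-programming scheme is given below. It is not the code of the Shortest procedures; the name `shortest_postfix` and the flag `use_power` are our own. Since the postfix length of a formula is the sum of the lengths of its two children plus one, a shortest formula for n can always be assembled from shortest formulas for the children.

```python
def shortest_postfix(N: int, use_power: bool = False) -> list:
    """best[n] = one shortest postfix formula for n, for n = 1..N."""
    best = [None] * (N + 1)
    best[1] = "1"
    for n in range(2, N + 1):
        # root +: by symmetry it suffices to try i <= n - i
        cands = [best[i] + best[n - i] + "+" for i in range(1, n // 2 + 1)]
        # root *: both factors larger than 1
        cands += [best[i] + best[n // i] + "*"
                  for i in range(2, n // 2 + 1) if n % i == 0]
        if use_power:
            i = 2
            while i * i <= n:           # root ^: n = i^j with j > 1
                j, p = 1, i
                while p < n:
                    p *= i
                    j += 1
                if p == n:
                    cands.append(best[i] + best[j] + "^")
                i += 1
        best[n] = min(cands, key=len)   # ties are broken arbitrarily
    return best


am = shortest_postfix(30)
ame = shortest_postfix(30, use_power=True)
print(len(am[27]), len(ame[27]))  # 17 11
```

For example, 27 needs 17 symbols with addition and multiplication only (nine 1's), but 11 symbols once exponentiation is allowed, as in the formula for twenty-seven given in the Introduction.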

Asymptotics 

The well-known asymptotics for C_a(n) = (2n-2)!/((n-1)! n!) can be easily derived from Stirling's
formula, yielding \frac{1}{4\sqrt{\pi}} 4^n n^{-3/2}. It is much harder to derive the asymptotics for C_{am}(n) and C_{ame}(n)
rigorously, but using procedure Zinn of ArithFormulas, we get the following non-rigorous estimates

C_{am}(n) \asymp c_1 n^{-3/2} (4.077...)^n ,

C_{ame}(n) \asymp c_2 n^{-3/2} (4.131...)^n ,

for some constants c_1, c_2.
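
A crude numerical cross-check of the growth constants is possible with plain ratios of consecutive terms. The sketch below is not the method implemented in procedure Zinn; it simply assumes the form c n^{-3/2} \mu^n and cancels the power of n, so that C(n)/C(n-1) (n/(n-1))^{3/2} should approach \mu. All names are our own.

```python
def count_seq(N: int, with_times: bool) -> list:
    """Counts of {+}-formulas (with_times=False) or {+, *}-formulas for n = 1..N."""
    c = [0, 1] + [0] * (N - 1)
    for n in range(2, N + 1):
        total = sum(c[i] * c[n - i] for i in range(1, n))
        if with_times:
            total += sum(c[i] * c[n // i]
                         for i in range(2, n // 2 + 1) if n % i == 0)
        c[n] = total
    return c


def growth_estimate(c: list, n: int) -> float:
    """Ratio estimate of mu, assuming c[n] behaves like const * n^(-3/2) * mu^n."""
    return c[n] / c[n - 1] * (n / (n - 1)) ** 1.5


N = 200
print(growth_estimate(count_seq(N, False), N))  # close to 4 (addition only)
print(growth_estimate(count_seq(N, True), N))   # to be compared with 4.077...
```

The addition-only case serves as a calibration, since there the limit is known to be exactly 4.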

The Book of Minimal Formulas 

To get the enumeration (up to n = 40), and a list of optimal-length formulas for n from 2 to 8000,
generated by procedures SeferAM(K1,K2) and SeferAME(K1,K2) (with K1 = 40, K2 = 8000)
for formulas using only addition and multiplication and for formulas also using exponentiation,
respectively, see the two webbooks

http://www.math.rutgers.edu/~zeilberg/tokhniot/oArithFormulas1 ,
http://www.math.rutgers.edu/~zeilberg/tokhniot/oArithFormulas2 .

These minimal expressions are listed in postfix notation, ready to be entered into a Reverse Polish 
Calculator (available on-line, e.g. http://www.alcula.com/calculators/rpn/, viewed March 1, 
2013). They are given in the most memory-efficient way (using procedure MinMemory) so as to 
minimize the number of memory locations (stack-size) needed, i.e. realizing the Strahler number 
(see Stra in ArithFormulas). 
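
To illustrate the connection between stack-size and the Strahler number, here is a small Python sketch (not the code of MinMemory or Stra; the tuple encoding and all names are assumptions of ours). For the commutative operations + and * it emits the operand with the larger Strahler number first, which makes the peak stack size equal to the Strahler number of the tree; operands of ^ are left in place, since swapping them would change the value.

```python
def strahler(t) -> int:
    """Strahler number of a formula tree; a leaf (the integer 1) has number 1."""
    if t == 1:
        return 1
    _, a, b = t
    sa, sb = strahler(a), strahler(b)
    return sa + 1 if sa == sb else max(sa, sb)


def max_stack(postfix: str) -> int:
    """Peak stack size while evaluating a postfix string over 1, +, *, ^."""
    depth = peak = 0
    for ch in postfix:
        depth += 1 if ch == "1" else -1   # a binary operation pops 2, pushes 1
        peak = max(peak, depth)
    return peak


def to_postfix_min_stack(t) -> str:
    """Postfix form with the 'heavier' operand first for + and *."""
    if t == 1:
        return "1"
    op, a, b = t
    if op in "+*" and strahler(b) > strahler(a):
        a, b = b, a                       # safe: + and * are commutative
    return to_postfix_min_stack(a) + to_postfix_min_stack(b) + op


two = ("+", 1, 1)
seven = ("+", 1, ("*", two, ("+", 1, two)))   # 1 + (1+1)*(1+(1+1))
print(max_stack("111+111++*+"))               # 5, the left-to-right postfix form
print(to_postfix_min_stack(seven))            # 11+11+1+*1+
print(max_stack(to_postfix_min_stack(seven)), strahler(seven))  # 3 3
```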

We also have analogous procedures for using addition and exponentiation (i.e. no multiplication). 
The output is presented in the following webbook 

http://www.math.rutgers.edu/~zeilberg/tokhniot/oArithFormulas3 .

Conclusion



In addition to the great intrinsic interest of this project (what can be more natural or fundamental
than expressing integers?), it is also a case study in using Experimental Mathematics to enumerate,
randomly generate, and optimally generate, combinatorial objects. We believe that the
same methodology could be applied to Boolean formulas and even Boolean circuits, which might
shed light, from yet another angle, on the central problem of theoretical computer science, the
notorious P vs. NP problem. So far most of the work was done by humans, using pencil-and-paper.
It is about time that computers put some effort towards settling the most central problem of their
field, or at the very least, give some empirical and experimental insight about it.